Abstract:

Many innovative schemes for allocating jobs to parallel computing systems he 
proposed in order to achieve highly utilized parallel computing systems. The s 
have tried to achieve good job response times with little system fragmentatic 
processing resources. Since most schemes have concentrated on approaches 
processor allocation, the schemes have used First-Come-First-Serve (FCFS) 
scheduling discipline. However, it has been previously established that job sc 
algorithms for parallel computing systems can have a large impact on the sys 
utilization and job response time. Schemes that use multiple queues, which 
sequence of jobs allocated to the parallel system, can be very effective in imp 
system performance. However, such non-FCFS schemes have been criticized 
they provide improved average performance by favoring small jobs at the exf 
large jobs. In order to achieve improved performance by means of multiple qi 
scheduling schemes without sacrificing the fairness of FCFS, we propose a nei 
scheduling discipline that behaves in a FCFS manner under low loaded conditi 
exploits performance enhancing features of multiple queue schemes under hi 
conditions. In addition, the scheme does not inappropriately discriminate agai 
jobs 
Detailed Description Text - DETX (20):

As noted above, in connection with step 116 (FIG. 2B), the task dispatcher
12 may, after (1) receiving the notification from a processor 10 that it has
finished processing its assigned task and (2) determining that processing of
tasks pointed to by all of the entries 15 has been completed, reorder the
entries 15 in the task identification queue 13 so that one or more of the
entries 15 relating to the last completed tasks are moved to the beginning of
the queue 13, so that they may be used to dispatch tasks at the beginning of
the next iteration. In the following, it will be assumed that only one entry
15, which points to the last-completed task, is moved to the beginning of the
task identification queue 13. In this operation, the task dispatcher first
retrieves the contents of the entry in the task assignment list 16 associated
with the processor 10 from which it received the notification (step 120). The
contents retrieved from the task assignment list 16 point to the entry 15(i)
in the task identification queue 13 that, in turn, identified the last finished
task in the task store 11. The task dispatcher 12 then dequeues the entry
15(i) in the task identification queue 13 identified by the contents retrieved
in step 120 (step 121) and enqueues it at the head of the task identification
queue 13 (step 122).
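
Steps 120 through 122 can be illustrated with a minimal Python sketch. The names `move_last_completed_to_head`, `task_queue`, and `assignment_list` are hypothetical stand-ins for the task dispatcher 12, the task identification queue 13, and the task assignment list 16; the text does not specify the data structures, so a `deque` and a dictionary are assumed here.

```python
from collections import deque

def move_last_completed_to_head(task_queue, assignment_list, processor_id):
    """Move the queue entry for a processor's last-finished task to the head."""
    # Step 120: look up which queue entry the notifying processor was assigned.
    entry = assignment_list[processor_id]
    # Step 121: dequeue that entry from wherever it sits in the queue.
    task_queue.remove(entry)
    # Step 122: enqueue it at the head for the next iteration.
    task_queue.appendleft(entry)
    return task_queue

queue = deque(["entry0", "entry1", "entry2", "entry3"])
assignments = {0: "entry1", 1: "entry3"}  # processor id -> assigned entry
move_last_completed_to_head(queue, assignments, processor_id=1)
print(list(queue))
```

In this sketch the processor that finished last (processor 1) held `entry3`, so that entry is dispatched first on the next pass.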

Claims Text - CLTX (4) : 

C. a task dispatcher comprising means for sequentially dispatching the 
plurality of tasks to said plurality of digital data processors by sequentially 
referring to the plurality of task identification entries in the order that the 
plurality of task identification entries are arranged in said task 
identification queue to retrieve the plurality of tasks, and means for 
reordering the plurality of task identification entries in said task 
identification queue after all of the tasks represented by the plurality of 
A mechanism is provided for reordering bus transactions to increase bus utilization in a computer system in which a
split-transaction bus is bridged to a single-envelope bus. In one embodiment, both masters and slaves are ordered,
simplifying implementation. In another embodiment, the system is more loosely coupled with only masters being
ordered. Greater bus utilization is thereby achieved. To avoid deadlock, transactions begun on the split-transaction bus
are monitored. When a combination of transactions would, if a predetermined further transaction were to begin, result
in deadlock, this condition is detected. In the more tightly coupled system, the predetermined further transaction, if it is
requested, is refused, thereby avoiding deadlock. In the more loosely-coupled system, the flexibility afforded by
unordered slaves is taken advantage of to, in the typical case, reorder the transactions and avoid deadlock without
killing any transaction. Where a data dependency exists that would prevent such reordering, the further transaction is
killed as in the more tightly-coupled embodiment. Data dependencies are detected in accordance with
address-coincidence signals generated by slave devices on a cache-line basis. In accordance with a further optimization, at least
one slave device (e.g., DRAM) generates page-coincidence bits. When two transactions to the slave device are to the
same address page, the transactions are reordered if necessary to ensure that they are executed one after another without
any intervening transaction. Latency of the slave is thereby reduced.
A peripheral interface circuit for an I/O node of a computer system. A peripheral
interface circuit for an input/output node of a computer system includes a first 
buffer circuit, a second buffer circuit and a bus interface circuit. The first 
buffer circuit receives packet commands and may include a first plurality of 
buffers each corresponding to a respective virtual channel of a plurality of 
virtual channels. The second buffer circuit is coupled to receive packet commands 
stored in the first buffer circuit into commands suitable for transmission on a
peripheral bus. 

27 Claims, 13 Drawing figures 



Record Display Form 



L7: Entry 4 of 11 



File: USPT 



Feb 10, 2004 



US-PAT-NO: RE38428 

DOCUMENT-IDENTIFIER: US RE38428 E 



TITLE: Bus transaction reordering in a computer system having unordered slaves 



DATE-ISSUED: February 10, 2004 



INVENTOR-INFORMATION:
NAME                 CITY             STATE   ZIP CODE   COUNTRY
Kelly; James D.      Scotts Valley    CA
Regal; Michael L.    Pleasanton       CA


ASSIGNEE-INFORMATION:
NAME                   CITY        STATE   ZIP CODE   COUNTRY   TYPE CODE
Apple Computer, Inc.   Cupertino   CA                           02



APPL-NO: 10/006939
DATE FILED: November 30, 2001 



REISSUE-DATA:
US-PAT-NO   DATE-ISSUED         APPL-NO   DATE-FILED
05996036    November 30, 1999   779632    January 7, 1997



PARENT-CASE: 

.Iadd.This application is a continuation-in-part of U.S. patent application Ser.
No. 08/432,622, filed May 2, 1995, now abandoned. .Iaddend.

INT-CL: [07] G06F 9/46, G06F 13/36, G11C 7/00

US-CL-ISSUED: 710/110; 710/107, 709/208, 370/402 
US-CL-CURRENT: 710/110; 370/402, 709/208, 710/107 

FIELD-OF-SEARCH: 710/110, 710/107, 710/263, 710/41, 710/52, 710/311, 709/100, 
709/208, 714/47, 711/151, 370/402 

PRIOR-ART-DISCLOSED:

U.S. PATENT DOCUMENTS

PAT-NO        ISSUE-DATE        PATENTEE-NAME         US-CL
4181974       January 1980      Lemay et al.
4473880       September 1984    Budde et al.
4494193       January 1985      Brahm et al.
4965716       October 1990      Sweeney
5006982       April 1991        Ebersole et al.
5191649       March 1993        Cadambi et al.
5257356       October 1993      Brockmann et al.
5287477       February 1994     Johnson et al.
5305442       April 1994        Pedersen et al.
5307505       April 1994        Houlberg et al.
5327538       July 1994         Hamaguchi et al.
5327570       July 1994         Foster et al.
5333276       July 1994         Solari
5345562       September 1994    Chen
5355455       October 1994      Hilgendorf et al.
5363485       November 1994     Nguyen et al.
5369748       November 1994
5375215       December 1994     Hanawa et al.
5418914       May 1995          Heil et al.
5442763       August 1995       Bartfai et al.
5469435       November 1995     Krein et al.
5473762       December 1995     [illegible] et al.
5542056       July 1996         Jaffa et al.
5544332       August 1996       Chen
5546546       August 1996       Bell et al.
5592631       January 1997      Kelly et al.
5592670       January 1997      Pletcher
5615343       March 1997        Sarangdhar et al.
5680402       October 1997      Olnowich et al.
5682512       October 1997      Tetrick
[illegible]   January 1998      Parks et al.
5822772       October 1998      Chan et al.
5930485       July 1999         Kelly
5933612       August 1999       Kelly et al.



ART-UNIT: 2181

PRIMARY-EXAMINER: Ray; Gopal C. 




ATTY-AGENT-FIRM: Fenwick & West LLP 
ABSTRACT: 

A mechanism is provided for reordering bus transactions to increase bus utilization 
in a computer system in which a split-transaction bus is bridged to a single- 
envelope bus. In one embodiment, both masters and slaves are ordered, simplifying 
implementation. In another embodiment, the system is more loosely coupled with only 
masters being ordered. Greater bus utilization is thereby achieved. To avoid 
deadlock, transactions begun on the split-transaction bus are monitored. When a 
combination of transactions would, if a predetermined further transaction were to 
begin, result in deadlock, this condition is detected. In the more tightly coupled 
system, the predetermined further transaction, if it is requested, is refused, 
thereby avoiding deadlock. In the more loosely-coupled system, the flexibility 
afforded by unordered slaves is taken advantage of to, in the typical case, reorder 
the transactions and avoid deadlock without killing any transaction. Where a data 
dependency exists that would prevent such reordering, the further transaction is
killed as in the more tightly-coupled embodiment. Data dependencies are detected in 
accordance with address-coincidence signals generated by slave devices on a cache- 
line basis. In accordance with a further optimization, at least one slave device 
(e.g., DRAM) generates page-coincidence bits. When two transactions to the slave 
device are to the same address page, the transactions are reordered if necessary to 
ensure that they are executed one after another without any intervening 
transaction. Latency of the slave is thereby reduced. 

19 Claims, 27 Drawing figures 
File: USPT



Nov 30, 1999 



DOCUMENT-IDENTIFIER: US 5996036 A

TITLE: Bus transaction reordering in a computer system having unordered slaves
Abstract Text (1):

A mechanism is provided for reordering bus transactions to increase bus utilization 
in a computer system in which a split-transaction bus is bridged to a single- 
envelope bus. In one embodiment, both masters and slaves are ordered, simplifying 
implementation. In another embodiment, the system is more loosely coupled with only 
masters being ordered. Greater bus utilization is thereby achieved. To avoid 
deadlock, transactions begun on the split-transaction bus are monitored. When a 
combination of transactions would, if a predetermined further transaction were to 
begin, result in deadlock, this condition is detected. In the more tightly coupled 
system, the predetermined further transaction, if it is requested, is refused, 
thereby avoiding deadlock. In the more loosely-coupled system, the flexibility 
afforded by unordered slaves is taken advantage of to, in the typical case, reorder 
the transactions and avoid deadlock without killing any transaction. Where a data 
dependency exists that would prevent such reordering, the further transaction is
killed as in the more tightly-coupled embodiment. Data dependencies are detected in 
accordance with address-coincidence signals generated by slave devices on a cache- 
line basis. In accordance with a further optimization, at least one slave device 
(e.g., DRAM) generates page-coincidence bits. When two transactions to the slave 
device are to the same address page, the transactions are reordered if necessary to 
ensure that they are executed one after another without any intervening 
transaction. Latency of the slave is thereby reduced. 

Brief Summary Text (20) : 

A mechanism is provided for reordering bus transactions to increase bus utilization 
in a computer system in which a split-transaction bus is bridged to a single- 
envelope bus. In one embodiment, both masters and slaves are ordered, simplifying 
implementation. In another embodiment, the system is more loosely coupled with only 
masters being ordered. Greater bus utilization is thereby achieved. In accordance 
with one embodiment of the invention, a queuing structure includes multiple master 
queues and multiple slave queues. The queuing structure receives bus grant signals 
and respective slave acknowledge signals from respective slave devices. Each time 
an address bus grant is issued a record is entered in the queuing structure, the 
record comprising a first entry in a master queue identified by the address bus 
grant signals, and a second entry in a slave queue identified by the slave 
acknowledge signals. The first entry identifies a target slave device in accordance 
with the slave acknowledge signals, and the second entry identifies an originating 
master device in accordance with the address bus grant signals. A matching circuit 
is responsive to queue entries from the queuing structure for producing match bits 
identifying selected records the first entry of which is at the head of a master 
queue. A data arbitration circuit is responsive to the match bits and to queue 
entries from the queuing structure for generating data bus grant signals for the 
master devices and for generating for each slave device a multibit signal which 
when active identifies a transaction within the transaction queue of the slave
device.



Detailed Description Text (19) : 

The manner in which the foregoing signals are used to decouple address tenures and 
data tenure may be appreciated with reference to FIG. 5. For simplicity, the
address arbitration phase has not been illustrated. The address transfer phase is 
essentially the same as in the conventional case. The address termination phase, 
however, differs. The addressed slave asserts the AACK signal in the conventional 
manner, the AACK signal being used by the master. In parallel with AACK, the 
addressed slave generates a SACK signal for use by the arbiter 600. The arbiter 
uses this information about which slave has acknowledged in order to reorder 
transactions on the system bus 204. 

Detailed Description Text (147) : 

In the case of some slave devices, the average latency of the slave device may be
reduced by reordering transactions involving the slave device. In the case of DRAM,
for example, page mode reads take less time than non-paged reads. Hence, in the
embodiment of FIG. 19, the ArbDatSM 604' further receives page coincidence (PC)
signals from at least one slave device, i.e., DRAM. The ArbDatSM block 604'
receives from the slave device a separate signal for every possible pair of queue
entries within the slave device, indicating whether the targets of both
transactions queued within the pair of queue entries are within the same page. The
ArbDatSM block 604' therefore receives Q(Q+1)/2 total page coincidence bits or, in
the illustrated embodiment, 2.times.3/2=3 bits.
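
The pairwise page-coincidence signals can be sketched as follows. This is only an illustration under stated assumptions: the function name `page_coincidence_bits`, the 4 KiB page size, and the example addresses are hypothetical and do not come from the patent. With queue locations numbered 0 through Q and Q=2, there are three entries and therefore three pairs, which agrees with the 3 bits stated above.

```python
from itertools import combinations

def page_coincidence_bits(queued_addresses, page_size):
    """Return one bit per pair of queue entries: True if both hit the same page."""
    return {
        (i, j): queued_addresses[i] // page_size == queued_addresses[j] // page_size
        for i, j in combinations(range(len(queued_addresses)), 2)
    }

# Three queued transactions in one slave (locations 0..2), assumed 4 KiB pages.
bits = page_coincidence_bits([0x1000, 0x2040, 0x1F80], page_size=0x1000)
print(len(bits))  # one bit per pair of entries
print(bits)
```

An arbiter could use such bits to grant entries 0 and 2 back to back, since only that pair shares a page.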

Detailed Description Text (165) : 

As previously described in relation to FIG. 6, slave ordering is a major cause of 
deadlock. When what would otherwise be a deadlocking transaction is detected, it is 
"killed" by issuing an ARtry signal. Without slave ordering, a large proportion of 
what would otherwise be deadlocking transactions, instead of being killed, can now 
be accepted and reordered in relation to other transactions so as to avoid 
deadlock. Such reordering is not possible, however, when a data dependency exists. 
For example, a read of one data location by one device followed by a write of the 
same data location by another device does not yield the same result as if the 
execution order is reversed. If a deadlock situation cannot be avoided by 
transaction reordering because of a data dependency, the need remains to kill the 
deadlocking transaction. 

Detailed Description Text (177) : 

Referring to FIG. 25, the inputs and outputs of the ARtryGen block 613' are
illustrated in greater detail. The inputs along the top and left edges of the
ARtryGen block 613' remain unchanged compared to the ARtryGen block 613 of FIG. 6.
Unlike the ARtryGen block 613 of FIG. 6, however, the ARtryGen block 613', instead
of generating ARtry based on the assumption of ordered slaves, uses certain
deadlock address-coincidence (DLAC) inputs received at the bottom edge of the block
to generate a "qualified" ARtry signal only when a data dependency prevents
transactions from being reordered so as to avoid the deadlock. The slave devices
each monitor each system bus address tenure and compare the address placed on the
bus to addresses queued within the respective slave devices. If the address on the
bus is the same as an address already queued within the slave device, the slave
device raises its DLAC signal to the ARtryGen block 613'. All slave devices or only
selected slave devices (most importantly DRAM) may monitor the bus and signal the
ARtryGen block 613' in this manner. In the illustrated embodiment, all slave
devices are assumed to provide a DLAC signal. The ARtryGen block 613' therefore
receives signals DLAC.sub.0 through DLAC.sub.(S+1).
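
A rough behavioral sketch of the qualified ARtry decision is given below. It is a simplification, not the patent's logic: the function names `dlac_signals` and `qualified_artry` are hypothetical, the comparison is made per cache line as the abstract describes, and the 32-byte line size is an assumption.

```python
def dlac_signals(bus_address, slave_queues, line_size=32):
    """Each slave raises DLAC if the bus address hits a cache line it has queued."""
    line = bus_address // line_size
    return [any(addr // line_size == line for addr in queue) for queue in slave_queues]

def qualified_artry(would_deadlock, dlac):
    """Retry (kill) the transaction only if it would deadlock AND a data
    dependency, signalled by any DLAC input, prevents reordering around it."""
    return would_deadlock and any(dlac)

queues = [[0x100], [0x200]]          # addresses already queued in slaves 0 and 1
dlac = dlac_signals(0x104, queues)   # new address falls in slave 0's queued line
print(dlac, qualified_artry(True, dlac))
print(qualified_artry(True, dlac_signals(0x400, queues)))
```

In the second call no slave reports a coincidence, so the would-be deadlocking transaction can be accepted and reordered instead of being killed.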

Current US Original Classification (1) : 
710/110 

Current US Cross Reference Classification (1) :
709/208 

Current US Cross Reference Classification (2) : 
710/107 
CLAIMS:

1. In a computer system having a system bus and having arbitration circuitry,
multiple master devices including a system microprocessor, and multiple slave
devices, all coupled to the system bus, a method of reordering system bus
transactions, comprising the steps of:

receiving and queuing within a particular slave device a plurality of transactions; 

within said arbitration circuitry, arbitrating between pending transactions based 
on arbitration policies including an arbitration policy that responses are received 
by respective master devices in the same order as requests were issued by the 
respective master devices; and 

at least some of the time, said arbitration circuitry, without signalling said 
microprocessor, signalling said particular slave device such that the system bus is 
granted for a later queued transaction within said particular slave device prior to 
being granted for an earlier queued transaction. 
ABSTRACT

A mechanism is provided for reordering bus transactions to
increase bus utilization in a computer system in which a
split-transaction bus is bridged to a single-envelope bus. In
one embodiment, both masters and slaves are ordered,
simplifying implementation. In another embodiment, the
system is more loosely coupled with only masters being
ordered. Greater bus utilization is thereby achieved. To
avoid deadlock, transactions begun on the split-transaction
bus are monitored. When a combination of transactions
would, if a predetermined further transaction were to begin,
result in deadlock, this condition is detected. In the more
tightly coupled system, the predetermined further
transaction, if it is requested, is refused, thereby avoiding
deadlock. In the more loosely-coupled system, the flexibility
afforded by unordered slaves is taken advantage of to, in the
typical case, reorder the transactions and avoid deadlock
without killing any transaction. Where a data dependency
exists that would prevent such reordering, the further
transaction is killed as in the more tightly-coupled embodiment.
Data dependencies are detected in accordance with
address-coincidence signals generated by slave devices on a
cache-line basis. In accordance with a further optimization, at least
one slave device (e.g., DRAM) generates page-coincidence
bits. When two transactions to the slave device are to the
same address page, the transactions are reordered if
necessary to ensure that they are executed one after another
without any intervening transaction. Latency of the slave is
thereby reduced.

19 Claims, 21 Drawing Sheets
Detailed Description Text - DETX (11):

FIG. 3 also shows queue 240 to receive an entire line of read data from
agent or device 180, 185, or 190. In one example, queue 240 is a line-size
buffer. As noted above, address segmentation unit 220 stores the address bits
for ordering the line of read data according to the transaction requested by
processor 100. Once the entire line of read data is present in queue 240,
steering logic, for example, in state machine 230 is employed to reconfigure
data based on the address transaction requested. If the transaction requested
was a linear line read (e.g., A[4:3]# is 00b), the read data is returned as
linear line data. Conversely, if the transaction requested is a non-linear
line read (e.g., A[4:3]# is 01b, 10b, or 11b), the configuring address is
associated with the read data and the data is returned as non-linear line data.
One way this modification may be done is by utilizing multiplexer 250 coupled
to queue 240 and controlling the output to processor 100. For example, a state
machine may be utilized such that when a transaction is linear, multiplexer 250
does not reorder the line data. When the transaction is non-linear, state
machine 230 utilizes multiplexer 250 to reorder the line data prior to
forwarding to the processor.
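
The steering step can be pictured with a short sketch. The text does not state how the non-linear order is derived from A[4:3]#, so the sketch assumes a four-chunk line and a toggle (XOR) burst order in which the requested chunk is returned first; that ordering rule and the name `order_line` are assumptions made for illustration only.

```python
def order_line(chunks, a43):
    """Return a buffered four-chunk line in the order implied by address bits A[4:3].

    a43 == 0b00 is a linear line read: data passes through unchanged.
    Any other value is treated as non-linear, using an ASSUMED toggle order
    (chunk index = a43 XOR position); the source does not specify the order.
    """
    if len(chunks) != 4 or not 0 <= a43 <= 3:
        raise ValueError("expected four chunks and a two-bit A[4:3] value")
    if a43 == 0b00:
        return list(chunks)                     # multiplexer does not reorder
    return [chunks[a43 ^ i] for i in range(4)]  # multiplexer reorders the line

line = ["c0", "c1", "c2", "c3"]   # full line already held in the queue
print(order_line(line, 0b00))
print(order_line(line, 0b10))
```

Buffering the whole line first, as the queue does, is what makes it possible to apply either ordering after the data has arrived.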
Detailed Description Text - DETX (192):

The local processor 514 continues looping through the flowcharts illustrated
in FIGS. 14c and 14d, immediately reordering errored books where appropriate
(blocks 678 and 680); sending new reorder messages to the global reorder
processor 510 when necessary (blocks 678 and 682); requesting global reorders
after every good book that represents the end of a carrier route or change in
the postal code when no global reorders are pending in the local queue (i.e.,
the global reorder flag is false) (blocks 636, 656, 658, and 660); and
obtaining new GRT_Seg_IDs from the global reorder processor whenever a sub-job
is completed (block 674) until the press controller 514 determines that the
last book in the book ticket and the last global reorder book sent from the
global reorder processor 510 have been printed (block 630). At such a point,
control advances to block 631, and the local processor 514 waits for an
operator command to pull a new book ticket to restart the process, or to call
the process outlined in FIG. 19 and discussed above.
DOCUMENT-IDENTIFIER: US RE38428 E

TITLE: Bus transaction reordering in a computer system having unordered slaves
Brief Summary Text (12):

In a traditional pipelined implementation, data bus tenures are kept in strict 
order with respect to address tenures. However, external hardware can further 
decouple the address and data buses, allowing the data tenures to occur out of 
order with respect to the address tenures. Second-generation PowerPC computers 
include computers whose architecture was especially designed for high performance 
and that incorporated such hardware. This architecture supports true split-bus 
operation with ordered slaves and ordered masters. "Ordered" means each master and 
each slave has its own independent FIFO structure supporting "ordered" service to
transactions posted to it. If a slave receives three transactions A, B, and C, then 
it will respond to A first, B second, and C third. If a master performs 
transactions D, E, and F, then it expects servicing of those transactions in the 
order of D first, E second, and F third. There can be up to a selected number of 
outstanding master/slave pair transactions in the architecture at one time. In one 
preferred embodiment, this selected number is three outstanding pair transactions. 
As a result, in the foregoing architecture, an expansion bridge may concurrently 
have one outstanding slave transaction to it and one outstanding master transaction 
from it. Although ordered masters and slaves, as opposed to unordered masters and 
slaves, provide an overall simplification to system architecture, they can lead to 
deadlocks when there are conflicting completion dependencies. 

Detailed Description Text (15) : 

The architecture of the computer system of FIG. 2 decouples address and data 
tenures such that data bus utilization is increased. This increase in data bus 
utilization allows for higher real-time performance to be achieved. In particular, 
the present invention allows for a true split-bus architecture with ordered slaves 
and ordered masters. "Ordered," in one usage, means each master and each slave has 
its own independent FIFO structure supporting "ordered" service to transactions
posted to it. If a slave receives three transactions A, B, and C, then it will
respond to A first, B second, and C third. If a master performs transactions D, E, 
and F, then it expects servicing of those transactions in the order of D first, E 
second, and F third. In one embodiment, there can be up to three outstanding 
master/slave pair transactions at one time. 



Detailed Description Text (19): 

The manner in which the foregoing signals are used to decouple address tenures and 
data tenure may be appreciated with reference to FIG. 5. For simplicity, the 
address arbitration phase has not been illustrated. The address transfer phase is 
essentially the same as in the conventional case. The address termination phase, 
however, differs. The addressed slave asserts the AACK signal in the conventional 
manner, the AACK signal being used by the master. In parallel with AACK, the 
addressed slave generates a SACK signal for use by the arbiter 600. The arbiter 
uses this information about which slave has acknowledged in order to reorder 
transactions on the system bus 204.
Detailed Description Text (43) :

Assume now that an RDDA is received from Slave 3. Transactions 1, 2 and 3 will 
then, in that order, be granted the bus and will complete. In the foregoing 
example, whereas the address order of the transactions is 1, 2, 3, 4, 5, the data 
order is 4, 5, 1, 2, 3. 

Detailed Description Text (59) : 

The PCI2PCI bridge has become interlocked, and must flush a posted write upstream
of itself; in this case the write is headed toward the ARBus, and Expansion Bridge
1's buffers are full and cannot currently accept the write. The first two
outstanding transactions in this scenario are 1) Master Processor 1 has an
outstanding read of Expansion Bridge 1, followed by 2) Master Processor 1 has an
outstanding write to Expansion Bridge 1. The third attempted transaction is a write
cycle from Expansion Bridge 1 to main memory. However, this write cycle is to
copyback-cacheable space and causes a snoop hit in Processor 1's cache. Processor 1
retries Expansion Bridge 1's write cycle, but now needs to push the dirty cache
line to main memory. However, at this point it is unable to push the dirty cache
line due to its outstanding write to Expansion Bridge 1. With the use of DBWO*,
Processor 1 could have re-ordered the snoop push write transaction around its
outstanding read of Expansion Bridge 1 (transaction number 1). However, the MPC60x
microprocessor is not capable of re-ordering the snoop push write transaction
around its own outstanding write. This is a Type-B Lockup, caused by Processor 1's
inability to complete its read due to the PCI2PCI bridge's interlocking behavior.
This deadlock is avoided by having the ARBus arbiter prevent the Processor from
writing to an expansion bridge if it has an outstanding read of the expansion
bridge. This will allow the Processor to perform the Snoop Push write transaction
if required.

Detailed Description Text (64): 

Following ARBus protocol after an ARTRY*, all masters on the bus deassert their Bus 
Requests to give the Processor a guaranteed window being the only bus requestor. 
This guarantees that the Processor, who normally has lowest ARBus priority, 
acquires the bus next in order to complete a high priority transaction such as a 
Snoop Push. In this case, the ARBus protocol causes the lower priority expansion 
bridge to never receive a Bus Grant due to the higher priority Video requesting the 
ARBus to complete its access. Since the completion of the Video access is dependent 
on the expansion bridge freeing up some buffer space, and since the expansion 
bridge must get the ARBus to complete its access or receive an ARTRY* in order to 
free up PCI Bus 1 to free up buffer space for the Video write to come in, the 
expansion bridge effectively needs higher priority than Video this time. This is a 
Type-C Lockup, and is avoided by having an expansion bridge keep its Bus Request 
asserted the clock following an ARTRY* if it is the source of the ARTRY*. This is 
precisely the protocol the MP60X processor performs to effectively achieve a higher 
priority when necessary. 

Detailed Description Text (97) : 

The description thus far has assumed a system in which both masters and slaves are 
ordered. In particular, the 60X microprocessor assumes that its transactions are 
ordered. As a consequence, master ordering is to some extent ingrained within the
underlying system architecture. Slave ordering, on the other hand, although it may 
be convenient from an implementation perspective, is not required. Increased 
efficiency may be achieved by relaxing the constraint of slave ordering, thereby 
allowing transaction independence within slaves. To achieve unordered slaves, 
additional information must be exchanged between the slaves and the arbiter. As 
before, this information may be exchanged in the form of additional side-band 
signals not provided for by the MPC60X bus specification. 

Detailed Description Text (99) : 

Considering first the block ArbMux 603', in order to allow for transaction 
independence within slaves, the ArbMux 603' receives as inputs all of the queue 
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entries of all of the slave queues (instead of just all of the front entries as in 
FIG. 6). Therefore, if the masters are numbered 0 through M, the slaves are
numbered 0 through S and the queue locations within each slave queue are numbered
0 through Q, then the ArbMux 603' receives (S+1)(M+1+1)(Q+1) bits of information
from the slave queues. One of the bits in the expression (M+1+1) is a valid bit
that allows for a flop-based queue implementation instead of one requiring random-
access memory. In an exemplary embodiment with S=5, M=4, and Q=2, the number of
bits received from the slave queues is 6 × 6 × 3 = 108 bits. Since masters
remain ordered, the ArbMux 603' continues to receive only the front entries from
the master queues, the same as in FIG. 6. In the illustrated embodiment, the ArbMux
603' receives from the master queues (M+1)(S+1+1+1) = 5 × 8 = 40 bits. As in the
case of the slave queue entries, one of the bits in the expression (S+1+1+1) is a
valid bit. The extra bit in the expression (S+1+1+1) is a read/write bit as
described previously.
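
The bit counts above can be checked with a short calculation. The following is a
minimal sketch only; the function names are hypothetical and are not part of the
described embodiment. The parameters M, S and Q are the highest master, slave and
queue-location indices, as in the text.

```python
def slave_queue_bits(M: int, S: int, Q: int) -> int:
    """Bits the ArbMux 603' receives from all entries of all slave queues.

    Each slave queue entry is assumed to hold one bit per master (M+1 bits)
    plus one valid bit, giving the (M+1+1) term in the text.
    """
    entry_width = (M + 1) + 1
    return (S + 1) * entry_width * (Q + 1)


def master_queue_bits(M: int, S: int) -> int:
    """Bits received from the front entry of each master queue.

    Each front entry is assumed to hold one bit per slave (S+1 bits), one
    valid bit and one read/write bit, giving the (S+1+1+1) term in the text.
    """
    entry_width = (S + 1) + 1 + 1
    return (M + 1) * entry_width


# Exemplary embodiment from the text: S=5, M=4, Q=2.
slave_bits = slave_queue_bits(M=4, S=5, Q=2)
master_bits = master_queue_bits(M=4, S=5)
```

For the exemplary embodiment this reproduces the 108 and 40 bits stated above.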

Detailed Description Text (107) : 

In FIG. 6, a transaction is allowed to proceed only if it is the frontmost 
transaction of both the master and the slave. The matching queue location within 
the slave is by definition always the frontmost valid queue location within the 
slave. In the case of ArbMux 603 of FIG. 6, therefore, its function is to identify 
masters whose next transaction in order is also the next transaction in order of 
the target slave device. In the case of the ArbMux 603' of FIG. 19, slave ordering
is no longer required. Hence, the function of the ArbMux 603' is to identify for 
each master the queue location within the target slave that matches the frontmost 
transaction of the master. The ArbMux 603' also indicates whether transaction data 
for that queue location is ready. Hence, for each master, two bits, a SlvMatch bit 
and a SlvRdReady bit, are output for each queue location. In the case of master 
M.sub.0, the bit pairs output by the ArbMux 603 1 are designated M.sub.O Q.sub.0, 
M.sub.O Q.sub.l, . . . , M.sub.O Q. sub. (Q+l), and likewise for each succeeding 
master up to and including the last master M.sub.(M+l), the outputs for which are 
M. sub. (M+l) Q.sub.0, M.sub.(M+l) Q.sub.l, . . . , M.sub.(M+l) Q. sub. (Q+l). If a 
master has a valid transaction in its queue, then for the frontmost valid 
transaction, the SlvMatch signal for that master that corresponds to the matching 
target slave queue location will be asserted. If the master has no valid 
transaction in its queue, then no signal is asserted for that master. 
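
The matching function of the ArbMux 603' can be modelled in software as follows.
This is a behavioural sketch under stated assumptions, not the circuit of FIG. 19:
the SlaveEntry type and the slv_match function are hypothetical names, and it is
assumed that the frontmost valid slave queue entry belonging to a master is the one
that matches that master's frontmost transaction (which follows from masters
remaining ordered).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SlaveEntry:
    valid: bool               # valid bit of the slave queue location
    master: int               # master that queued this transaction
    data_ready: bool = False  # whether transaction data is ready


def slv_match(master: int,
              front_target: Optional[int],
              slave_queues: List[List[SlaveEntry]]) -> List[Tuple[int, int]]:
    """Return one (SlvMatch, SlvRdReady) pair per slave queue location.

    front_target is the slave designated by the master's frontmost valid
    transaction, or None if the master has no valid transaction.
    """
    depth = len(slave_queues[0])
    pairs = [(0, 0)] * depth
    if front_target is None:
        return pairs  # no signal is asserted for this master
    for q, entry in enumerate(slave_queues[front_target]):
        if entry.valid and entry.master == master:
            pairs[q] = (1, int(entry.data_ready))
            break  # only the frontmost matching location is reported
    return pairs


# Slave 0 holds a transaction from master 1 at location 0 and a
# transaction from master 0, with data ready, at location 1.
queues = [[SlaveEntry(True, 1), SlaveEntry(True, 0, True), SlaveEntry(False, 0)]]
pairs_m0 = slv_match(master=0, front_target=0, slave_queues=queues)
pairs_idle = slv_match(master=2, front_target=None, slave_queues=queues)
```

Here master 0 is matched at queue location 1 even though it is not the frontmost
location of slave 0, which is the behaviour that relaxing slave ordering permits.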

Detailed Description Text (114): 

As previously described in relation to FIG. 6, slave ordering is a major cause of 
deadlock. When what would otherwise be a deadlocking transaction is detected, it is 
"killed" by issuing an ARtry signal. Without slave ordering, a large proportion of 
what would otherwise be deadlocking transactions, instead of being killed, can now 
be accepted and reordered in relation to other transactions so as to avoid 
deadlock. Such reordering is not possible, however, when a data dependency exists. 
For example, a read of one data location by one device followed by a write of the
same data location by another device does not yield the same result as when the
execution order is reversed. If a deadlock situation cannot be avoided by
transaction reordering because of a data dependency, the need remains to kill the 
deadlocking transaction. 
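
The data dependency in the read/write example can be illustrated with a small
simulation. This is an illustrative sketch only; the run function, the address
0x100 and the values 7 and 42 are arbitrary assumptions chosen for the example.

```python
def run(ops, memory):
    """Apply (kind, address, value) operations in order; return values read."""
    mem = dict(memory)  # work on a copy so each run starts from the same state
    reads = []
    for kind, addr, value in ops:
        if kind == "read":
            reads.append(mem[addr])
        else:  # "write"
            mem[addr] = value
    return reads


initial = {0x100: 7}
read_by_a = ("read", 0x100, None)   # read of a data location by one device
write_by_b = ("write", 0x100, 42)   # write of the same location by another

in_order = run([read_by_a, write_by_b], initial)
reordered = run([write_by_b, read_by_a], initial)
```

The read returns the old value when executed first and the newly written value
when executed second, so the two transactions cannot be reordered.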

Detailed Description Text (118): 

To take a concrete example, assume that the frontmost queue entry for Master 0 
designates Slave 0. Assume further that the SlvMatch bits for Master 0 are 010, 
indicating that the match is for queue entry 1 of Slave 0. Without taking into 
account the address coincidence bits of Slave 0, the transaction in queue entry 1 
will be executed if Master 0 is the winning master. Now assume that the address 
coincidence bits of Slave 0 are 100, indicating that the transactions within queue 
locations 0 and 1 are directed to the same cache line. A data dependency therefore 
exists between the transactions such that they must be executed in order. To
prevent the transaction in queue entry 1 from being executed before the transaction 
in queue entry 0, the SlvMatch bits of Master 0 are modified, i.e., changed from 
010 to 000. The same modification is performed for each arbitration cycle until the
transaction in queue entry 0 has executed. The address coincidence bits for Slave 0 
will then be 000. The SlvMatch bits of Master 0 then, instead of being modified, 
remain 010 such that the transaction in queue entry 1 may be executed next if 
Master 0 is the winning master. 
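
The modification of the SlvMatch bits in this example can be sketched as follows.
The exact encoding of the address coincidence bits is not reproduced here; as an
assumption for illustration, coincidence is represented as a set of (earlier,
later) queue-location pairs, and the SlvMatch bits as a list indexed by queue
location. The function name mask_slv_match is hypothetical.

```python
from typing import List, Set, Tuple


def mask_slv_match(slv_match: List[int],
                   coincident: Set[Tuple[int, int]]) -> List[int]:
    """Clear the SlvMatch bit of any location that must wait for an earlier one.

    coincident holds (earlier, later) pairs of queue locations whose
    transactions are directed to the same cache line and are both pending.
    """
    masked = list(slv_match)
    for earlier, later in coincident:
        if earlier < later and masked[later]:
            masked[later] = 0  # the later transaction must not go first
    return masked


# Queue entry 1 matches Master 0, but entries 0 and 1 share a cache line.
blocked = mask_slv_match([0, 1, 0], {(0, 1)})
# After the transaction in entry 0 has executed, no coincidence remains.
released = mask_slv_match([0, 1, 0], set())
```

The first call corresponds to the change from 010 to 000 described above, and the
second to the bits remaining 010 once the coincidence has cleared.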

Detailed Description Text (120) : 

In operation, the ArbDatSM 604' determines to which masters the PC bits will be
applied, e.g., which masters have a DRAM transaction at the front of their queues, 
in accordance with the SACK vectors at the head of the master queues. The PC bits 
are then used to determine which queue locations cannot have the transactions 
queued therein go next without forfeiting the speed advantage to be gained from 
paged access. In practice, if a PC bit is asserted, the transactions to which the 
PC bit relates will be scheduled for execution prior to any other transactions 
involving the DRAM. In other words, if the DRAM has three transactions queued, two 
of which are to the same page, the execution order will be COINCIDENT, COINCIDENT, 
NON-COINCIDENT, instead of NON-COINCIDENT, COINCIDENT, COINCIDENT, although both 
sequences yield the same speed advantage. In other embodiments, any execution order 
that results in the page-coincident transactions being executed one after another 
without any intervening transaction may be acceptable for purposes of the PC bits. 
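
The scheduling effect of the PC bits can be sketched as an ordering of queued DRAM
transactions by page. This is a software illustration under assumptions, not the
logic of the ArbDatSM: the function name paged_order is hypothetical, each
transaction is represented only by the page it addresses, and data dependencies
between transactions are ignored.

```python
from collections import Counter
from typing import Hashable, List


def paged_order(pages: List[Hashable]) -> List[int]:
    """Return queue indices in an order that runs page-coincident
    transactions first and back to back, followed by the rest."""
    counts = Counter(pages)
    first_seen = {}
    for i, page in enumerate(pages):
        first_seen.setdefault(page, i)
    return sorted(
        range(len(pages)),
        # Coincident pages first, grouped by page, then original queue order.
        key=lambda i: (counts[pages[i]] == 1, first_seen[pages[i]], i),
    )


# Three queued transactions, two of which (indices 1 and 2) share page 3.
order = paged_order([7, 3, 3])
```

For the three-transaction case in the text this yields COINCIDENT, COINCIDENT,
NON-COINCIDENT.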